DETAILED ACTION
Response to Arguments 

1. Applicant's arguments, see Remarks, filed 11/08/04, with respect to claims 21-29
have been fully considered and are persuasive. The rejections of claims 21-29 have
been withdrawn.

Applicant's arguments filed 11/08/04 regarding claims 1-20 have been fully 
considered but they are not persuasive. 

Regarding applicant's arguments, page 4, para. 4, concerning claim 1 and
claims 19 and 20, which have similar features: "...if Yamada were combined with
Ellozy, syllables are not indexed and stored for use in searching a database in response
to a user query in the combination."

However, Yamada teaches indexing a Chinese/Japanese index based on
syllable, which is a minimal unit of language (col. 4, line 45 to col. 5, line 9, cited in the
previous action). The applicant claims "...semantic units for use in searching ..." Yamada
teaches, at col. 7, lines 14, 15, "an index indicator that provides an index for assisting a
search for a desired data item." Therefore, the Examiner interprets the indexed syllables
as being used in searching, which meets the limitations as claimed, and thus the previous
rejection remains valid.

Regarding applicant's arguments concerning claims 8 and 9, the alleged
deficiencies with respect to claim 1 have been addressed above. The applicant
states, "Orsolini does not index and search based on semantic units, but rather the user
chooses a keyword . . . (which is then) used to query the text balanced tree for each
recording" (col. 5, lines 28-43). However, the Examiner does not rely on Orsolini for the
defined semantic units, but rather, as cited in the previous action, for:

"processing the user query to generate one or more semantic units
representing the information that the user seeks to retrieve" (col. 1, line 65 to col. 2, line
9; the user chooses a keyword, which is used to query the text balanced tree for each
recording, col. 5, lines 28-43);

"searching the one or more indexed semantic units to find a substantial match
with the one or more semantic units associated with the user query" (col. 2, lines 3-9,
col. 5, lines 45-55); and

"retrieving one or more segments of the audio-based data using the one or more
indexed semantic units that match the one or more semantic units associated with the
user query" (col. 2, lines 10-24, col. 5, lines 45-61).

Therefore, the Examiner relies on the searching system taught by
Orsolini et al. because it would provide efficient content searching of recordings. As per
claim 9, Orsolini et al. teach wherein the searching step further comprises presenting the
retrieved data to the user (col. 5, lines 45-48).

Regarding applicant's arguments concerning claim 14, page 5, para. 3, the
Examiner has provided a citation of the limitation below.

Specification 

2. The disclosure is objected to because of the following informalities: 
On page 8, line 12, after "aud5,", "aud6" should be --aud7--.
On page 8, line 13, "6th" should be --7th--. The lengthy specification has not
been checked to the extent necessary to determine the presence of all possible minor 
errors. Applicant's cooperation is requested in correcting any errors of which applicant 
may become aware in the specification. 

Appropriate correction is required. 

Claim Rejections - 35 USC § 112

3. The following is a quotation of the first paragraph of 35 U.S.C. 112: 

The specification shall contain a written description of the invention, and of the manner and process of 
making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the 
art to which it pertains, or with which it is most nearly connected, to make and use the same and shall 
set forth the best mode contemplated by the inventor of carrying out his invention. 

4. Claim 19 is rejected under 35 U.S.C. 112, first paragraph, as failing to comply 
with the enablement requirement. The claim(s) contains subject matter which was not 
described in the specification in such a way as to enable one skilled in the art to which it 
pertains, or with which it is most nearly connected, to make and/or use the invention. 

The claim recites, "at least one processor..." but lacks means for performing the 
various operations that the processor performs. 

A single means claim, i.e., where a means recitation does not appear in
combination with another recited element of means, is subject to an undue breadth
rejection under 35 U.S.C. 112, first paragraph. In re Hyatt, 708 F.2d 712, 714-715, 218
USPQ 195, 197 (Fed. Cir. 1983) (A single means claim which covered every
conceivable means for achieving the stated purpose was held nonenabling for the
scope of the claim because the specification disclosed at most only those means known
to the inventor.). When claims depend on a recited property, a fact situation comparable
to Hyatt is possible, where the claim covers every conceivable structure (means) for
achieving the stated property (result) while the specification discloses at most only
those known to the inventor.

Claim Rejections - 35 USC § 103 

5. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

6. Claims 1-13 and 15-20 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Ellozy et al. (Ellozy US 5,649,06) in view of Yamada (US 6,166,733).

As per claims 1 and 19-20, Ellozy et al. teach a method of processing audio-based
data associated with a particular language, the method comprising (figure 3):

"storing the audio-based data" (his Audio/Video recording 12, col. 5, lines 5-20);

"generating a textual representation of the audio-based data, the textual
representation being in the form of one or more semantic units corresponding to the
audio-based data" (his Automatic Speech Recognizer 31 and his Decoded Text 38, col.
5, lines 30-35); and

"indexing the one or more semantic units and storing the one or more indexed
semantic units for use in searching the stored audio-based data in response to a user
query" (his indexing 60, col. 7, lines 13-20).

It is noted that Ellozy teaches the claimed invention but does not explicitly teach
wherein a semantic unit comprises a minimal unit of language having a semantic
meaning. However, this feature is well known in the art, as evidenced by Yamada, who
teaches at col. 4, line 45 to col. 5, line 9, indexing a Chinese/Japanese index based
on syllable, which is a minimal unit of language. Yamada teaches, at col. 7, lines 14, 15,
"an index indicator that provides an index for assisting a search for a desired data item."
Therefore, the Examiner interprets the indexed syllables as being used in searching, and
one having ordinary skill in the art at the time the invention was made would have found it
obvious to recognize that the keyword/keyphrase based indexing of Ellozy could be
further indexed based on syllable as taught by Yamada because it would facilitate the
sorting and would save space in the memory allocation.

As per claim 5, Ellozy et al. teach wherein the generating step comprises
decoding the audio-based data in accordance with a speech recognition system (figure 3,
his Automatic Speech Recognizer 34, col. 5, lines 30-32).

As per claim 6, Ellozy teaches wherein the speech recognition system employs a
semantic unit based language model (col. 6, lines 47-65, his word language model).

As per claim 7, Ellozy teaches "wherein the indexing step comprises time stamping
the one or more semantic units" (col. 5, line 47 to col. 6, line 30, his time stamping of
the indexed words).

As per claim 15, Ellozy teaches "wherein the one or more semantic units are
indexed according to at least one of when the audio-based data was produced and where
the audio-based data was produced" (figure 3, his time alignment 42).

As per claim 23, Ellozy teaches "the user query comprises a word" (col. 2, line 40,
target recognized word).

As per claim 27, Ellozy teaches "the generating step comprises producing the
textual representation via stenography" (col. 1, lines 43-48; col. 5, lines 14, 15).

As per claim 28, Ellozy teaches "the searching step (col. 2, lines 39, 48, the
searching) comprises use of a hierarchical index" (col. 3, lines 54-56, an ordered series
of index, i.e., a hierarchical index).

As per claim 29, Ellozy teaches "the searching step comprises use of an
automatic boundary marking system" (Fig. 5, dO.lines 10-31, the time stamp
automatically sets boundaries used in searching).

As per claims 2-4, Yamada teaches "wherein the semantic unit is a syllable,"
"wherein the syllable is a phonetically based syllable," and "wherein the semantic unit is a
morpheme" (col. 4, line 45 to col. 5, line 9).

Therefore, it would have been obvious to modify Ellozy with Yamada by including
indexed syllables used in the searching. The motivation for doing so would have been to
facilitate faster searching (col. 2, lines 55-57).

As per claim 21, Yamada teaches "employing a syllable language model"
(col. 4, lines 20-24). Therefore, it would have been obvious to modify Ellozy with Yamada
by having the speech recognition system of Ellozy employ the syllable language model
of Yamada in place of Ellozy's language model. The motivation for doing so would have
been to facilitate faster searching (col. 2, lines 55-57).

7. Claims 21, 22, and 24-26 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Ellozy et al. (Ellozy US 5,649,06) in view of Yamada (US 6,166,733),
and further in view of Lee (US 5,220,639).

As per claim 21, Ellozy in view of Yamada does not teach transcribing audio data to
generate syllables, deriving conditional probabilities of distribution based on the
generated syllables, and using syllable counts and the conditional probabilities to
construct the syllable language model.

However, Lee teaches transcribing audio data to generate syllables (Fig. 1: input
speech, with the output being the transcription), deriving conditional probabilities of
distribution based on the generated syllables, and using syllable counts and the
conditional probabilities to construct the syllable language model (Fig. 1, input speech;
Fig. 2, HMM, with the initial/transitional/observational probabilities interpreted as the
conditional probabilities of distribution based on the generated syllables; col. 3, lines
25-27, character syllables; col. 7, line 27 to col. 8, line 6, counting occurrence
frequencies of characters/syllables, Markov Model Chinese Language Model, where the
counts and probabilities are used in the construction of the syllable language model).
Therefore, it would have been obvious to modify Ellozy and Yamada with Lee by
producing the syllabic language model of Yamada in the well known manner of Lee. The
motivation for doing so would have been to correctly transcribe input speech (col. 2,
lines 3-6).

As per claim 25, Lee further teaches a phonetically-based syllable comprises a
toneme (col. 4, lines 33-35, 39, 40: tone, syllable).

As per claim 26, Lee further teaches two or more different pronunciations are
associated with a phonetically-based syllable (col. 4, lines 31-37: the multiple
pronunciations "ba-1, ba-2" are associated with a phonetically-based syllable, reducing
the 1300 to 400 phonetically-based syllables).

As per claim 24, Ellozy and Yamada do not explicitly teach that the searching step
further comprises transforming the word into a sequence of syllables using a
text-to-phonetic syllable map. However, Lee further teaches "transforming a word into a
sequence of syllables using a text-to-phonetic syllable map" (col. 7, lines 43-46:
"computer" is transformed to a sequence of syllables, from text to phonetic syllables,
necessarily comprising a map). Therefore, it would have been obvious to modify Ellozy's
search method by transforming the query word into syllables. The motivation for doing so
would have been to have a syllabic description of an input word for use in searching,
which enhances the speed of searching (Yamada, col. 4, lines 45-53).

Conclusion 

8. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Lamont M. Spooner whose telephone number is 
571/272-7613. The examiner can normally be reached on 8:00 AM - 5:00 PM. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Richemond Dorvil can be reached on 571/272-7602. The fax phone number 
for the organization where this application or proceeding is assigned is 703-872-9306.

Information regarding the status of an application may be obtained from the
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 